Adversarial learning for distant supervised relation extraction
Recently, many researchers have concentrated on using neural networks to learn features for Distant Supervised Relation Extraction (DSRE). These approaches generally use a softmax classifier with cross-entropy loss, which inevitably brings the noise of the artificial class NA into the classification process. To address this shortcoming, a classifier with ranking loss is applied to DSRE. Uniformly randomly selecting a relation and heuristically selecting the highest-scoring incorrect relation are two common methods for generating a negative class in the ranking loss function. However, most of the generated negative classes can be easily discriminated from the positive class and contribute little to training. Inspired by Generative Adversarial Networks (GANs), we use a neural network as the negative class generator to assist the training of our desired model, which acts as the discriminator in GANs. Through the alternating optimization of generator and discriminator, the generator learns to produce negative classes that are increasingly hard to discriminate, and the discriminator has to improve as well. This framework is independent of the concrete form of the generator and discriminator. In this paper, we use a two-layer fully connected neural network as the generator and Piecewise Convolutional Neural Networks (PCNNs) as the discriminator. Experimental results show that our proposed GAN-based method is effective and performs better than state-of-the-art methods.
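The two negative-class selection strategies mentioned in the abstract can be illustrated with a minimal sketch. The loss below is one common margin-based pairwise ranking form, not necessarily the exact loss used in the paper; the function names and the values of `margin_pos`, `margin_neg`, and `gamma` are illustrative assumptions, and the learned generator itself is not modeled here.

```python
import math
import random

def ranking_loss(scores, gold, negative,
                 margin_pos=2.5, margin_neg=0.5, gamma=2.0):
    """Pairwise ranking loss for one example (assumed form).

    scores: list of per-relation scores; gold and negative are class indices.
    The loss pushes the gold score above margin_pos and the negative
    score below -margin_neg.
    """
    pos_term = math.log1p(math.exp(gamma * (margin_pos - scores[gold])))
    neg_term = math.log1p(math.exp(gamma * (margin_neg + scores[negative])))
    return pos_term + neg_term

def random_negative(scores, gold, rng=random):
    """Uniformly pick any relation other than the gold one."""
    candidates = [i for i in range(len(scores)) if i != gold]
    return rng.choice(candidates)

def hardest_negative(scores, gold):
    """Pick the highest-scoring incorrect relation."""
    candidates = [i for i in range(len(scores)) if i != gold]
    return max(candidates, key=lambda i: scores[i])

# Hypothetical scores for four relation classes; class 1 is the gold label.
scores = [0.2, 1.5, -0.3, 1.1]
gold = 1
hard = hardest_negative(scores, gold)
print(hard, ranking_loss(scores, gold, hard))
```

In this toy example the highest-scoring incorrect relation yields a larger loss than a low-scoring one, which is the sense in which easily discriminated negatives contribute little to training.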
MICK: A Meta-Learning Framework for Few-shot Relation Classification with Small Training Data
Few-shot relation classification seeks to classify incoming query instances
after meeting only a few support instances. This ability is gained by training
with a large amount of in-domain annotated data. In this paper, we tackle an even
harder problem by further limiting the amount of data available at training
time. We propose a few-shot learning framework for relation classification,
which is particularly powerful when the training data is very small. In this
framework, models not only strive to classify query instances, but also seek
underlying knowledge about the support instances to obtain better instance
representations. The framework also includes a method for aggregating
cross-domain knowledge into models by open-source task enrichment.
Additionally, we construct a brand new dataset: the TinyRel-CM dataset, a
few-shot relation classification dataset in the health domain with purposely small
training data and challenging relation classes. Experimental results
demonstrate that our framework brings performance gains for most underlying
classification models, outperforms the state-of-the-art results given small
training data, and achieves competitive results with sufficiently large
training data.
Relation Extraction with Self-determined Graph Convolutional Network
Relation Extraction is a way of obtaining the semantic relationship between
entities in text. The state-of-the-art methods use linguistic tools to build a
graph for the text in which the entities appear and then a Graph Convolutional
Network (GCN) is employed to encode the pre-built graphs. Although their
performance is promising, the reliance on linguistic tools results in a non
end-to-end process. In this work, we propose a novel model, the Self-determined
Graph Convolutional Network (SGCN), which determines a weighted graph using a
self-attention mechanism, rather than using any linguistic tool. Then, the
self-determined graph is encoded using a GCN. We test our model on the TACRED
dataset and achieve the state-of-the-art result. Our experiments show that SGCN
outperforms the traditional GCN, which uses dependency parsing tools to build
the graph.
Comment: CIKM-202
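The idea of letting self-attention determine a weighted graph, which a GCN layer then encodes, can be sketched as follows. This is a minimal illustration under stated assumptions, not the SGCN architecture itself: it uses a single attention head with scaled dot-product scores, one ReLU graph-convolution layer, and hypothetical weight matrices `w_q`, `w_k`, and `w`.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_graph(h, w_q, w_k):
    """Build a dense weighted adjacency matrix from token representations h.

    Each row is a softmax over scaled dot-product attention scores,
    so every row sums to 1.
    """
    q = matmul(h, w_q)
    k = matmul(h, w_k)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    return [softmax([s / scale for s in row]) for row in scores]

def gcn_layer(adj, h, w):
    """One graph-convolution step: ReLU(adj @ h @ w)."""
    out = matmul(matmul(adj, h), w)
    return [[max(0.0, v) for v in row] for row in out]

# Three hypothetical token vectors and identity projections.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
adj = attention_graph(h, identity, identity)
encoded = gcn_layer(adj, h, identity)
print(adj)
print(encoded)
```

Because the adjacency matrix is computed from the representations themselves, no dependency parser or other linguistic tool is needed to build the graph.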
CopyMTL: Copy Mechanism for Joint Extraction of Entities and Relations with Multi-Task Learning
Joint extraction of entities and relations has received significant attention due to its potential to improve performance on both tasks. Among existing methods, CopyRE is effective and novel: it uses a sequence-to-sequence framework and a copy mechanism to directly generate relation triplets. However, it suffers from two fatal problems. The model is extremely weak at distinguishing the head entity from the tail entity, resulting in inaccurate entity extraction. It also cannot predict multi-token entities (e.g. Steven Jobs). To address these problems, we give a detailed analysis of the reasons behind the inaccurate entity extraction problem, and then propose a simple but extremely effective model structure to solve it. In addition, we propose a multi-task learning framework equipped with a copy mechanism, called CopyMTL, to allow the model to predict multi-token entities. Experiments reveal the problems of CopyRE and show that our model achieves significant improvement over the current state-of-the-art method, by 9% on NYT and 16% on WebNLG (F1 score). Our code is available at https://github.com/WindChimeRan/CopyMT
Multi-Task Learning and Improved TextRank for Knowledge Graph Completion
Knowledge graph completion is an important technology for supplementing knowledge graphs and improving data quality. However, existing knowledge graph completion methods ignore the features of triple relations, and the entity description texts they introduce are long and redundant. To address these problems, this study proposes a multi-task learning and improved TextRank for knowledge graph completion (MIT-KGC) model. The key contexts are first extracted from redundant entity descriptions using the improved TextRank algorithm. Then, a lite bidirectional encoder representations from transformers (ALBERT) model is used as the text encoder to reduce the number of model parameters. Subsequently, the multi-task learning method is used to fine-tune the model by effectively integrating the entity and relation features. Experiments were conducted with the proposed model on the WN18RR, FB15k-237, and DBpedia50k datasets. The results showed that, compared with traditional methods, the mean rank (MR), top 10 hit ratio (Hit@10), and top three hit ratio (Hit@3) were enhanced by 38, 1.3%, and 1.9%, respectively, on WN18RR. Additionally, the MR and Hit@10 were increased by 23 and 0.7%, respectively, on FB15k-237. The model also improved the Hit@3 and the top one hit ratio (Hit@1) by 3.1% and 1.5%, respectively, on DBpedia50k, verifying the validity of the model.
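For orientation, the ranking step of plain TextRank can be sketched as a PageRank-style iteration over a sentence-similarity graph. The abstract does not describe the authors' improvements, so this is only the baseline algorithm in its normalized PageRank form; the similarity matrix and the `damping` and `iterations` values are illustrative assumptions.

```python
def textrank(similarity, damping=0.85, iterations=50):
    """Score sentences by iterating PageRank over a weighted graph.

    similarity[i][j] is the (assumed precomputed) similarity between
    sentences i and j; self-similarity on the diagonal is ignored.
    """
    n = len(similarity)
    scores = [1.0 / n] * n
    # Total outgoing edge weight of each sentence, excluding self-loops.
    out_weight = [sum(row[j] for j in range(n) if j != i)
                  for i, row in enumerate(similarity)]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(similarity[j][i] / out_weight[j] * scores[j]
                       for j in range(n)
                       if j != i and out_weight[j] > 0)
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores

# Hypothetical similarities: sentence 0 overlaps strongly with both others.
similarity = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.1],
    [1.0, 0.1, 0.0],
]
ranks = textrank(similarity)
print(ranks)
```

The highest-ranked sentences would then be kept as the key context of an entity description, with the rest discarded as redundant.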
Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks
Two problems arise when using distant supervision for relation extraction. First, in this method, an already existing knowledge base is heuristically aligned to texts, and the alignment results are treated as labeled data. However, the heuristic alignment can fail, resulting in the wrong label problem. In addition, in previous approaches, statistical models have typically been applied to ad hoc features. The noise that originates from the feature extraction process can cause poor performance. In this paper, we propose a novel model dubbed Piecewise Convolutional Neural Networks (PCNNs) with multi-instance learning to address these two problems. To solve the first problem, distant supervised relation extraction is treated as a multi-instance problem in which the uncertainty of instance labels is taken into account. To address the latter problem, we avoid feature engineering and instead adopt a convolutional architecture with piecewise max pooling to automatically learn relevant features. Experiments show that our method is effective and outperforms several competitive baseline methods.
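Piecewise max pooling can be sketched in a few lines: the output of one convolution filter is split into three segments at the two entity positions, and each segment is max-pooled separately. The exact boundary convention (each entity token closes its segment) and the 0.0 fallback for an empty segment are assumptions made for this illustration.

```python
def piecewise_max_pool(feature_map, head_pos, tail_pos):
    """Max-pool one filter's output in three pieces split at the entities.

    feature_map: list of floats, one value per token position.
    head_pos, tail_pos: token indices of the two entities (any order).
    """
    first, second = sorted((head_pos, tail_pos))
    segments = [
        feature_map[: first + 1],            # up to and including entity 1
        feature_map[first + 1 : second + 1], # between, including entity 2
        feature_map[second + 1 :],           # after entity 2
    ]
    # An empty segment (e.g. an entity at the sentence end) pools to 0.0.
    return [max(seg) if seg else 0.0 for seg in segments]

# Hypothetical filter output for a 7-token sentence, entities at 1 and 4.
feature_map = [0.1, 0.9, 0.3, 0.4, 0.8, 0.2, 0.7]
pooled = piecewise_max_pool(feature_map, head_pos=1, tail_pos=4)
print(pooled)  # [0.9, 0.8, 0.7]
```

Compared with a single max over the whole sentence, which would return only 0.9 here, the three pooled values retain coarse information about where a feature fired relative to the two entities.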
A word-embedding-based steganalysis method for linguistic steganography via synonym substitution
The development of steganography technology threatens the security of private information in smart campuses. To prevent privacy disclosure, a linguistic steganalysis method based on word embedding is proposed to detect private information hidden in synonyms in texts. With the continuous Skip-gram language model, each synonym and the words in its context are represented as word embeddings, which encode the semantic meanings of words into low-dimensional dense vectors. The context fitness, which characterizes the suitability of a synonym by its semantic correlations with context words, is estimated from their corresponding word embeddings and weighted by the TF-IDF values of the context words. By analyzing the differences in context fitness values of synonyms in the same synonym set, and the differences between those in the cover and stego texts, three features are extracted and fed into a support vector machine classifier for the steganalysis task. The experimental results show that the proposed steganalysis method improves the average F-value by at least 4.8% over two baselines. In addition, the detection performance can be further improved by learning better word embeddings.
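One plausible reading of the context fitness measure is a TF-IDF-weighted average of cosine similarities between a synonym's embedding and the embeddings of its context words. The abstract does not give the formula, so the form below, the function names, and the toy vectors are all assumptions for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def context_fitness(synonym_vec, context_vecs, tfidf_weights):
    """TF-IDF-weighted mean similarity between a synonym and its context."""
    total = sum(tfidf_weights)
    if total == 0:
        return 0.0
    weighted = sum(w * cosine(synonym_vec, c)
                   for c, w in zip(context_vecs, tfidf_weights))
    return weighted / total

# Two hypothetical context words; the first carries more TF-IDF weight.
context_vecs = [[1.0, 0.0], [0.0, 1.0]]
tfidf_weights = [3.0, 1.0]
fit_a = context_fitness([1.0, 0.0], context_vecs, tfidf_weights)
fit_b = context_fitness([0.0, 1.0], context_vecs, tfidf_weights)
print(fit_a, fit_b)
```

Under this reading, a synonym whose fitness is much lower than that of the other members of its synonym set is a candidate for having been substituted to carry hidden information.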